Nyström Method with Kernel K-means++ Samples as Landmarks
نویسندگان
چکیده
We investigate, theoretically and empirically, the effectiveness of kernel K-means++ samples as landmarks in the Nyström method for low-rank approximation of kernel matrices. Previous empirical studies (Zhang et al., 2008; Kumar et al., 2012) observe that the landmarks obtained using (kernel) K-means clustering define a good lowrank approximation of kernel matrices. However, the existing work does not provide a theoretical guarantee on the approximation error for this approach to landmark selection. We close this gap and provide the first bound on the approximation error of the Nyström method with kernel Kmeans++ samples as landmarks. Moreover, for the frequently used Gaussian kernel we provide a theoretically sound motivation for performing Lloyd refinements of kernel K-means++ landmarks in the instance space. We substantiate our theoretical results empirically by comparing the approach to several state-of-the-art algorithms.
منابع مشابه
Scalable Kernel K-Means Clustering with Nystrom Approximation: Relative-Error Bounds
Kernel k-means clustering can correctly identify and extract a far more varied collection of cluster structures than the linear k-means clustering algorithm. However, kernel kmeans clustering is computationally expensive when the non-linear feature map is highdimensional and there are many input points. Kernel approximation, e.g., the Nyström method, has been applied in previous works to approx...
متن کاملFast DPP Sampling for Nystrom with Application to Kernel Methods
The Nyström method has long been popular for scaling up kernel methods. Its theoretical guarantees and empirical performance rely critically on the quality of the landmarks selected. We study landmark selection for Nyström using Determinantal Point Processes (DPPs), discrete probability models that allow tractable generation of diverse samples. We prove that landmarks selected via DPPs guarante...
متن کاملRandomized Clustered Nystrom for Large-Scale Kernel Machines
The Nyström method has been popular for generating the low-rank approximation of kernel matrices that arise in many machine learning problems. The approximation quality of the Nyström method depends crucially on the number of selected landmark points and the selection procedure. In this paper, we present a novel algorithm to compute the optimal Nyström low-approximation when the number of landm...
متن کاملThe Variational Nystrom method for large-scale spectral problems
Spectral methods for dimensionality reduction and clustering require solving an eigenproblem defined by a sparse affinity matrix. When this matrix is large, one seeks an approximate solution. The standard way to do this is the Nyström method, which first solves a small eigenproblem considering only a subset of landmark points, and then applies an out-of-sample formula to extrapolate the solutio...
متن کاملDensity-Weighted Nyström Method for Computing Large Kernel Eigensystems
The Nyström method is a well-known sampling-based technique for approximating the eigensystem of large kernel matrices. However, the chosen samples in the Nyström method are all assumed to be of equal importance, which deviates from the integral equation that defines the kernel eigenfunctions. Motivated by this observation, we extend the Nyström method to a more general, density-weighted versio...
متن کامل